The Opacity Problem in ML - Lecture 7

What are the three main questions the lecture on "The Opacity Problem in ML" addresses?
1. Why is opacity a problem? 2. What does the opacity problem look like in AI? 3. How to achieve transparency? (Page 2)
What does "XAI" stand for?
eXplainable Artificial Intelligence (Page 4).
Why is there an increasing interest in XAI?
AI is omnipresent, it is used in high-stakes contexts (like health care), and people have "automation bias" (Page 4, Page 5, Page 6).
What is "automation bias"?
People tend to over-rely on automated systems, even when they know the system is faulty (Page 6).
What is "opacity" in the context of AI?
Humans often don't understand the model's decision logic (Page 7).
How was the Dutch benefits fraud scandal used as an example of opacity?
The Dutch tax authorities falsely accused thousands of families of fraud, based on an opaque "risk assessment" tool (Page 8).
What was the issue with Amazon's AI recruiting tool?
It taught itself that male candidates were preferable, penalizing resumes with words like "women's" or from all-women's colleges (Page 10).
Why is opacity a problem (i.e., why is interpretability needed)?
For Model Validation, Knowledge Discovery, User Acceptance, Justification, Assessing Fairness, and Assessing Trustworthiness (Page 11).
What is the central consequence of opacity?
If we don't know how a system works, we cannot trust, challenge, improve, or intervene on its outputs, and we lack control and oversight (Page 12).
What are some of the harms opacity can cause in high-stakes contexts?
Injustices, lack of recourse/contestability/accountability, and unsafety (Page 12).
In the ML lifecycle diagram, what component is typically considered the "black box"?
The "Model" (Page 17).
What is the "technical explainability problem"?
The challenge of using XAI methods to understand the "black box" model (Page 18).
What is a saliency map or "heatmap" in XAI?
A visualization showing which parts of an input (like an image) are most important for the model's decision-making process (Page 19).
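The idea behind a saliency map can be sketched in a few lines. This is a hypothetical toy example, not from the lecture: a finite-difference sensitivity map for a stand-in "model" that scores a 4x4 image, where only the top-left patch matters. Perturbing each pixel and measuring the change in the score marks the pixels the model relies on.

```python
import numpy as np

def score(img):
    # Stand-in black-box model (assumption for illustration):
    # only the top-left 2x2 patch influences the score.
    return float(img[:2, :2].sum())

img = np.ones((4, 4))
eps = 1e-3
saliency = np.zeros_like(img)
for i in range(4):
    for j in range(4):
        bumped = img.copy()
        bumped[i, j] += eps              # perturb one pixel
        saliency[i, j] = abs(score(bumped) - score(img)) / eps
# saliency is "hot" (≈1) on the top-left patch and 0 elsewhere
```

Real saliency methods compute this sensitivity via backpropagated gradients rather than per-pixel perturbation, but the interpretation is the same: hot regions are the parts of the input most important to the decision.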
What is a "feature importance ranking"?
A method that identifies and ranks the significance of individual input variables in influencing a model's predictions (Page 20).
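One common, model-agnostic way to build such a ranking is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. A minimal sketch on synthetic data (the model and data here are assumptions for illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # 3 input features
y = (2 * X[:, 0] + 0.1 * X[:, 2] > 0)    # feature 0 dominates, feature 1 is noise

def model(X):
    # Stand-in for a trained black box: here it matches the labeling rule.
    return 2 * X[:, 0] + 0.1 * X[:, 2] > 0

def accuracy(X):
    return float(np.mean(model(X) == y))

baseline = accuracy(X)
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
    importance.append(baseline - accuracy(Xp))

ranking = np.argsort(importance)[::-1]    # most important feature first
```

Shuffling the dominant feature destroys the prediction, so its accuracy drop (importance) is large; shuffling the unused noise feature changes nothing.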
What is a "counterfactual explanation" in ML?
It describes the smallest change to the feature values that changes the prediction to a predefined output (Page 21).
What is an example of a counterfactual explanation for a rejected loan application?
"If the applicant's income was €45.000, the loan would be approved." (Page 22).
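A counterfactual like this can be found by searching for the smallest feature change that flips the decision. A minimal sketch, assuming a toy scoring rule (the €45,000 threshold and the step size are illustrative assumptions, not the lecture's actual model):

```python
def approve(income: float) -> bool:
    # Stand-in loan model (assumption): approve at or above €45,000.
    return income >= 45_000

def counterfactual_income(income: float, step: float = 500.0) -> float:
    """Smallest income (in increments of `step`) at which the decision flips."""
    candidate = income
    while not approve(candidate):
        candidate += step
    return candidate

applicant = 42_000.0
needed = counterfactual_income(applicant)
# -> "If the applicant's income were €45,000, the loan would be approved."
```

Real counterfactual methods search over all features at once and minimize the total change, but the principle is the same: report the nearest input that yields the desired output.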
What are "interpretable models" as a type of XAI?
Models where you can directly interpret the model's decision logic (Page 23).
What are "post-hoc explanations" as a type of XAI?
Methods that estimate the decision logic of an existing model (Page 23).
What is the difference between "global" and "local" explanations?
Global explanations cover the model's overall logic, while local explanations cover a single prediction (Page 23).
What is the difference between "model-specific" and "model-agnostic" explanations?
Model-specific methods leverage characteristics of a specific model class, while model-agnostic methods can work on any model (Page 23).
What does Cynthia Rudin argue regarding high-stakes decisions?
Stop explaining black box models and use interpretable models instead (Page 23).
Is the opacity problem limited to just the "black box" algorithm?
No, the lecture states it is much larger and is a sociotechnical problem that occurs in the entire AI ecosystem (Page 25, Page 43).
What are the different stakeholders in the "real-world context of application" for an ML model?
Creators, Data subjects, Operators, Executors, Decision subjects / users, and Auditors (Page 28).
In the credit scoring example, who is the "Operator"?
The front-office bank teller who enters the data (Page 29, Page 30).
In the credit scoring example, who is the "Executor"?
The back-office analyst who uses the score to make a decision (Page 29, Page 30).
In the generative AI (e.g., DALLE) example, who are the "data subjects"?
Artists, photographers, and designers whose images and designs are in the datasets (Page 31).
According to Burrell (2016), what are the three sources of opacity?
Technical illiteracy, Intentional secrecy, and System complexity (Page 32).
What is opacity from "technical illiteracy"?
A system is opaque because the stakeholder does not have enough technical skills to understand it (Page 33).
What is opacity from "intentional secrecy"?
A system is opaque because a third party (like a developer) deliberately holds back information (Page 34).
What is opacity from "system complexity"?
A system is opaque to all stakeholders simply because of its characteristics (size, complexity) (Page 35).
How can data collection methods be opaque due to "intentional secrecy"?
Through the "non-disclosure of data collection methods" (Page 36).
What did the lawyer who used ChatGPT for a court filing discover?
That ChatGPT had invented everything (the cases were false), and it even falsely claimed the cases were real when asked to verify (Page 40, Page 41).
What is the "algorithmic divide"?
Inequalities in AI knowledge and understanding that exacerbate existing social inequalities (Page 42).
Are XAI methods a "silver bullet" for opacity?
No, the lecture states they are not (yet?) a silver bullet (Page 45).
What is the "fidelity" problem with post-hoc XAI methods?
They might not perfectly represent the model's true decision process, leading to misleading explanations (Page 46).
What is the "robustness" problem with XAI methods?
Many XAI methods are unreliable and prone to adversarial attacks (Page 46).
What is the "interpretability gap" with heatmaps?
The "hottest" parts of the map contain both useful and non-useful information, so it doesn't reveal *exactly* what the model used (Page 47).
What is the "feasibility" problem with counterfactual explanations?
Not all counterfactuals are actionable or realistic (e.g., suggesting a person change their age to get a loan) (Page 48).
What right does Recital 71 of the GDPR provide regarding automated decisions?
The right to obtain an explanation of the decision and to challenge the decision (Page 49).
What right does the AI Act (Art. 86) provide to an affected person?
The right to obtain "clear and meaningful explanations of the role of the AI system in the decision-making procedure" (Page 49).
Is transparency a goal in itself?
No, it is a means to an end; it's necessary to achieve other goals we value, like justice, safety, and accountability (Page 51).
What three solutions are needed to increase transparency?
More robust XAI, strong regulation, and public education (Page 52).